A portable, server-side dialog framework for VoiceXML
Abstract
We describe a spoken dialog application framework that combines the power and flexibility of server-side Java Servlets and Java Server Pages (JSPs) with the deployment portability, reliability, and scalability of standard web (HTTP) servers and VoiceXML clients. Applications are developed by extending a framework of Java classes to define dialogs through lower-level actions such as speech recognition, audio prompting, speech synthesis, and backend data access. The framework delegates session data management to servlets, embedding frame-based representations for the application's global and session data. Dialog flow is controlled through general constructions such as loops, conditionals, and scoped sub-dialogs, along with scoped command, error, and exception handling. Prompting and grammars are configured through simple JSP templates that generate the VoiceXML instructions for the server to return to the client. The framework is designed to be extensible, as demonstrated by the implementation of customizable backup and repeat commands integrated with session data, command handling, and grammar scoping.
VOICEXML CLIENTS & SERVERS
Like an HTML graphical web browser, a VoiceXML client proceeds by requesting data from a VoiceXML server using the HTTP protocol (see Figure 1). The server returns content in the form of a VoiceXML document [1]. The client interprets the document, which may involve local computation with ECMAScript [2], speech synthesis, playback of pre-recorded prompts, and speech recognition [3]. As part of its VoiceXML processing, the client may make further requests to the same server or to other servers. The client may send speech recognition results or the values of local variables to the server along with a request for a new page. The server updates its state based on the client request and sends an HTTP response containing the appropriate VoiceXML document back to the client.
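The request/response cycle described above can be illustrated with a minimal sketch. This is not the framework's actual API: the class `VoiceXmlTurnSketch`, the `Session` type, the `city` slot, the grammar file `city.grxml`, and the `/dialog` URL are all hypothetical names chosen for illustration, and the VoiceXML version attribute is an assumption. The servlet container is left out so that the example runs with the JDK alone; in a real deployment the `handleRequest` logic would sit behind a servlet or JSP.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of one server-side dialog turn; not the framework's real classes. */
public class VoiceXmlTurnSketch {

    /** Per-caller state kept on the server between requests (a flat frame of slots). */
    public static final class Session {
        public final Map<String, String> slots = new LinkedHashMap<>();
    }

    /** Escapes text so it can be placed safely in XML content or attribute values. */
    public static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /**
     * Handles one client request: stores the submitted values in the session,
     * then returns the next VoiceXML document for the client to interpret.
     */
    public static String handleRequest(Session session, Map<String, String> params,
                                       String submitUrl) {
        session.slots.putAll(params);          // server updates its state
        String city = session.slots.get("city");

        StringBuilder doc = new StringBuilder();
        doc.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        // Version 2.0 is an assumption; the source does not state the version used.
        doc.append("<vxml version=\"2.0\" xmlns=\"http://www.w3.org/2001/vxml\">\n");
        doc.append("  <form>\n");
        if (city == null) {
            // Slot still empty: prompt, recognize, and submit the result to the server.
            doc.append("    <field name=\"city\">\n");
            doc.append("      <prompt>Which city?</prompt>\n");
            doc.append("      <grammar src=\"city.grxml\" type=\"application/srgs+xml\"/>\n");
            doc.append("      <filled>\n");
            doc.append("        <submit next=\"").append(escape(submitUrl))
               .append("\" namelist=\"city\"/>\n");
            doc.append("      </filled>\n");
            doc.append("    </field>\n");
        } else {
            // Slot filled on an earlier turn: confirm it back to the caller.
            doc.append("    <block><prompt>You said ").append(escape(city))
               .append(".</prompt></block>\n");
        }
        doc.append("  </form>\n");
        doc.append("</vxml>\n");
        return doc.toString();
    }

    public static void main(String[] args) {
        Session session = new Session();
        // First request carries no data, so the server asks for the city.
        System.out.println(handleRequest(session, Map.of(), "/dialog"));
        // Second request carries the recognition result submitted by the client.
        System.out.println(handleRequest(session, Map.of("city", "Boston"), "/dialog"));
    }
}
```

The point of the sketch is the division of labor: the client only prompts, recognizes, and submits, while the decision about which document comes next is made on the server from the session state.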
VoiceXML specifies its own dialog control, which governs prompting and synthesis, recognition, telephony event handling, and local ECMAScript evaluation. Control flow may also be transferred to a document specified by a URL, either through a direct transfer or through a subdialog invocation. Despite this rich set of control structures, it is impossible to build a sophisticated, data-driven application with all of its logic expressed in VoiceXML on the client side, because there is neither a way to store data across sessions nor a way to access backend services and resources such as databases. Without nonstandard extensions to a VoiceXML client, as in [4], it is not possible to provide personalized dialog flow, customiz...
Related resources
Beyond the Form Interpretation Algorithm: Towards flexible dialog management for conversational voice and multi-modal applications
The dialog or voice user interface choices available to application developers in the present version of VoiceXML are largely limited to the capabilities of the Form Interpretation Algorithm (FIA) combined with dynamic server-side generation of VoiceXML. This position paper discusses several improvements aimed at providing flexible dialog management in VoiceXML. The notion of recursive transiti...
A VoiceXML Framework for Reusable Dialog Components
VoiceXML [1], or Voice Extensible Markup Language, is a markup language designed to facilitate the creation of speech applications, especially interactive voice response (IVR) applications. Unlike conventional IVR programming frameworks, which involve proprietary scripts and programming languages on proprietary, closed platforms, VoiceXML provides a declarative, programmin...
Towards VoiceXML compilation for portable embedded applications in ubiquitous environments
In this paper we present an approach to embedding VoiceXML applications by an off-line compilation scheme. Our primary motivation is that while VoiceXML is an established standard for voice applications, the complexity and resource requirements of VoiceXML interpretation have so far limited its spread to application areas other than telephony-based services. In many contexts, such as ubiquitous...
Florence: a dialogue manager framework for spoken dialogue systems
Recent advances in speech and language technology have made spoken dialogue systems mainstream in many industries. They allow customers to engage in natural speech interactions with machines instead of being compelled to navigate menus of options with touch-tone inputs. VoiceXML was a major milestone for the process of using automated speech applications to expose business portals to ubiquitou...
Implementation of dialog applications in an open-source VoiceXML platform
In this paper, we study the approach followed to use the VoiceXML standard in a dialog system platform already available in our group. As VoiceXML interpreter we have chosen OpenVXI, an open source portable solution where we can make the modifications needed to adapt the solution to the characteristics of our recognition and synthesis modules; so we will emphasize the changes that we have had t...